Adding Dockerfile and scripts for building and starting #457

Paladinium · 2024-12-13T21:31:07Z

A PR with a first draft of Docker support, as discussed in #378.

A few notes:

Not sure whether it works on Windows.
Not sure whether it works with CPUs only (I am using it with a GPU).
A Markdown readme was added. It holds the documentation until Docker turns out to be stable enough to move the docs to your wiki. For the time being, I would appreciate keeping the MD file so that quick changes stay easy.
I did not have time to check whether gradio works 100%, although the UI in general works. Please let this PR pass, and I will look into it if there are issues.
The image is not yet hardened. This means that the python application probably runs as root, which is a no-go for production environments. To be fixed later.
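
To confirm the "probably runs as root" suspicion, a check like the following could be run inside the container, for example through `docker exec`. This is a minimal sketch and not part of the PR; the function name `check_user` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): report whether the
# current process runs as root (UID 0) or as an unprivileged user.
check_user() {
  if [ "$(id -u)" -eq 0 ]; then
    echo "running as root"
  else
    echo "running as non-root user $(id -un)"
  fi
}

check_user
```

If it reports root, one common hardening step is to create a dedicated user in the Dockerfile and switch to it with the `USER` instruction.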

What is important:

DOCKER_README.md explains all the optional arguments to the scripts in detail. I recommend you read this one first.
A new folder docker was introduced
- conda is a subfolder with docker files that build an environment.yml file. This file is used in all the other docker builds to ensure that the environments are identical. It is also what ensures that DeepSpeed's environment is compatible with alltalk's environment.
- deepspeed is a subfolder to build DeepSpeed. It uses the conda environment file mentioned.
- versions.sh lists important variables. This should make it even simpler to bump versions.
  - As a side note, the versions are still duplicated in the Dockerfiles, but the duplicates might be removed later.
The top-level Dockerfile builds alltalk itself. It uses the conda environment file mentioned.
- There are wrapper scripts docker-build.sh and docker-start.sh that make it very simple to use. If you want to understand how everything fits together, those two files are a good start.
- docker-build.sh internally makes sure that the conda environment and DeepSpeed are built - no need to invoke them manually.
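
The idea behind a shared versions file can be sketched as follows. This is an illustration under assumptions, not the content of the actual versions.sh: the variable names, the Python version and the image tag `alltalk-sketch` are hypothetical. Only CUDA 12.4 comes from this PR. The sketch prints the `docker build` command instead of running it.

```shell
#!/bin/sh
# Hypothetical stand-in for versions.sh: names and values are illustrative.
CUDA_VERSION="12.4"     # CUDA version mentioned in this PR
PYTHON_VERSION="3.11"   # assumed value, for illustration only

# Turn the shared variables into --build-arg flags for docker build.
build_args() {
  printf '%s\n' "--build-arg CUDA_VERSION=${CUDA_VERSION} --build-arg PYTHON_VERSION=${PYTHON_VERSION}"
}

# Print the command rather than executing it, so the sketch has no side effects.
echo "docker build $(build_args) -t alltalk-sketch ."
```

Keeping the values in one sourced file means a version bump touches a single place, provided the Dockerfiles read them through `ARG` instead of repeating them.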

Less important, but still noteworthy:

There is no need to pick the pytorch version explicitly. It is chosen based on pytorch-cuda. However:
- pytorch currently supports CUDA only up to 12.4. It's not possible to go for CUDA 12.6 at this moment. See https://pytorch.org/get-started/locally/
- With CUDA 12.4, pytorch would actually be v2.5.1. However, installing ffmpeg pulls in other dependencies that downgrade pytorch to v2.4.0. I did not research which dependencies cause this or whether another build of ffmpeg would be better.
- When trying to force pytorch to v2.5.1, the build switches from the GPU variant to the CPU variant.
- I think it's not a problem, but good to know in case you are wondering.
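
To see which pytorch version and variant ended up in an environment, a check along these lines could be run inside the container. It is a sketch, not part of the PR: the function name `report_torch` is made up, and it assumes `python3` is the interpreter of the conda environment. For CPU-only builds, `torch.version.cuda` is `None`.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): print the installed torch
# version, the CUDA version it was built for, and whether a GPU is usable.
report_torch() {
  if command -v python3 >/dev/null 2>&1 && python3 -c 'import torch' >/dev/null 2>&1; then
    python3 -c 'import torch
print("torch", torch.__version__)
print("built for CUDA", torch.version.cuda)
print("GPU available", torch.cuda.is_available())'
  else
    echo "torch is not importable in this environment"
  fi
}

report_torch
```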


Paladinium · 2024-12-20T20:27:45Z

@erew123: This PR is ready for you to check/test.

erew123 · 2024-12-20T21:42:13Z

Which I think covers everything needed at the moment. Will pull the PR in.

Adding Dockerfile and scripts for building and starting

9a99126

Paladinium force-pushed the alltalkbeta_docker branch from 951bf89 to 9a99126 Compare December 16, 2024 19:34

Paladinium added 2 commits December 19, 2024 11:09

Adding docker build for DeepSpeed

7522fa8

Fixing docker build for DeepSpeed, introducing build for conda environment

c7f9b04

Fixing a typo in the readme

78b0a51

erew123 merged commit 5a18d68 into erew123:alltalkbeta Dec 20, 2024
1 check passed
